Abstract

Network management represents an architectural gap in today’s
Internet [1]. Many problems with computer networks today, such as
faults, misconfiguration, and performance degradation, are due to
insufficient support for network management, and the problem takes on
additional dimensions with the emerging programmable router paradigm.
The Internet Network Management Workshop is working to build a
community of researchers interested in solving the challenges of
network management via a combination of bottom-up analysis of data
from existing networks and top-down design of new architectures and
approaches driven by that data. This editorial sets out some of the
research challenges we see facing network management, and calls for
participation in working to solve them.

Categories and Subject Descriptors: C.2.3 [Computer-Communication
Networks]: Network Operations

General  Terms:  Management 

Keywords: Network & service management, Internet Network Management
Workshop, research questions


1.  INTRODUCTION 

In many ways, computer network management remains the least understood
aspect of computer networking. There is a lack of well-established
principles guiding the design of networks for manageability. There is
also a lack of scientific understanding of the evolution of network
state in real operational environments. Commercially, given the huge
dollar values involved in managing networks, a number of companies
have created products to help network operators gain visibility over
and manage the behavior of their networks. However, none of these
products comes close to addressing the challenges, and their users
remain deeply unsatisfied with the results, despite significant
investment in developing and deploying such products.

The difficulties of network management range from the mundane to the
failure of some of the best “organizing principles” of networking. For
example, network devices are too often still managed one box at a time
rather than as an integral networked system, despite the fact that the
correct operation of a network can require related updates to two
devices to be either both accepted or both rejected, with
transaction-like semantics. Underlying this somewhat mundane example
is a more fundamental challenge: ensuring the consistency of the vast
network state. As an organizing principle, “soft state” has proven to
be extremely valuable for achieving the eventual consistency of
protocol state in many individual network protocols. However, given
the vast amount of state that needs to be managed in each network
device, relying on the soft state principle alone may not be enough.
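
As a minimal sketch of the transaction-like semantics mentioned above,
the following Python fragment stages related updates on every device
and commits them only if all devices accept. The Device class, its
prepare/commit/abort methods, and the MTU validation rule are
hypothetical names introduced for illustration; they do not correspond
to any real device API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Device:
    """Hypothetical stand-in for a managed network device."""
    name: str
    config: dict = field(default_factory=dict)
    staged: Optional[dict] = None

    def prepare(self, update: dict) -> bool:
        # Validate and stage the update without applying it.
        # The rule used here (MTU bounds) is purely illustrative.
        mtu = update.get("mtu", self.config.get("mtu", 1500))
        if not 576 <= mtu <= 9216:
            return False
        self.staged = dict(update)
        return True

    def commit(self) -> None:
        # Make the staged update part of the running configuration.
        if self.staged is not None:
            self.config.update(self.staged)
            self.staged = None

    def abort(self) -> None:
        # Discard any staged update, leaving the configuration untouched.
        self.staged = None


def apply_atomically(updates: list) -> bool:
    """Apply every (device, update) pair, or none of them."""
    devices = [dev for dev, _ in updates]
    if all(dev.prepare(upd) for dev, upd in updates):
        for dev in devices:
            dev.commit()
        return True
    for dev in devices:
        dev.abort()
    return False
```

This sketch assumes that commit never fails once prepare has
succeeded. A real network cannot assume this, since a device or its
management channel may fail between the two phases, which is part of
what makes the problem hard.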

The Internet Network Management Workshop (INM) is a relatively young
venue created to foster state-of-the-art research in network
management. INM differs from other venues in that its focus is on
tackling fundamental network management problems with a combination of
a bottom-up analysis of data from existing network systems and a
top-down design of new systems grounded in the theories derived from
the bottom-up analysis. The INM workshop seeks to elevate
participants’ collective experience with operational IP networks into
concepts, principles, and theories that can be leveraged in today’s
networks and carried forward into clean-slate designs that
intrinsically support management rather than treating management as a
bolted-on afterthought.

In  this  editorial,  we  articulate  some  of  the  research  issues 
that  motivated  us  to  start  the  INM  workshop  and  discuss  a 
few  possible  research  directions. 


2.  RESEARCH  CHALLENGES 

New  Environments,  New  Challenges. 

These are exciting times for network management research as new
programmable router paradigms are dramatically increasing the
potential for control over the network elements. Platforms such as
OpenFlow [9] and software routers [2, 6, 4, 7, 11] create the need for
new network abstractions and mechanisms to manage these systems and
tie them into secure networks that operate with high reliability.
While existing networking systems have drawbacks, many bugs have been
driven from implementations over decades of use and refinement. Time
frames on that scale are not affordable for new systems. Since all new
systems have new bugs, the community must challenge itself to create
new classes of debugging tools that are compatible with and take full
advantage of the new abstractions. Some of the new platforms host the
control plane in a small number of servers, so we also need algorithms
for improving the network’s ability to survive the misbehavior of
these critical components.

The emerging cloud computing and data center infrastructures face
increasing user expectations for reliability and performance, yet they
present significant network and service management challenges as a
result of their scale and complexity. In these settings, the sheer
number of servers and network elements under management is leading to
fresh ideas for large-scale systems management [5, 3] that protect the
integrity of computation and data even in the face of numerous
failures and unpredictable workloads. INM is interested in the role
that networking should play in making data centers economical, which
includes issues of agility (e.g., giving each data center tenant the
illusion it has its own private expandable data center or data
centers), resource usage (e.g., ensuring each tenant obtains the
performance and resources it needs), and manageability (e.g., ensuring
that a small number of humans can operate very large data centers
while minimizing downtime).

In addition, the emergence of overlay networks as virtual networks on
top of the Internet introduces new types of operational and management
problems, such as multi-level failure propagation due to interactions
between overlay and underlay, and interdomain policy violations due to
bypassing service providers’ rules. Unlike native IP networks,
overlays have dynamic connectivity that makes monitoring, correlating,
and diagnosing these problems much harder.

Scalable  Yet  Rigorous  State  Management. 

Much of network management is centered on managing the distributed
state that is necessary to implement a service or to monitor the
health of the network. Therefore, research ought to take a serious
look at the challenges involved in maintaining the consistency of
network state. Traditionally, network protocols have relied heavily on
the notion of periodically refreshed soft state to achieve and
maintain eventual state consistency. However, with the state explosion
and accompanying increase in network service complexity, making every
piece of state soft may not be feasible. One potential direction is to
consider other points in the space of consistency models, from looser
partial consistency to stronger atomic, transactional consistency
semantics. Even in situations where these concepts seem particularly
helpful and natural, such as network configuration, change management,
and on-line network debugging, they have not been well explored.
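
To make the soft-state notion concrete, the following Python sketch
shows a table whose entries disappear unless they are refreshed within
a fixed lifetime. The SoftStateTable class and its method names are
assumptions made for illustration, and time is passed in explicitly so
that the behavior is easy to follow.

```python
class SoftStateTable:
    """Entries vanish unless refreshed within `lifetime` seconds."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime
        self._entries = {}  # key -> (value, expiry time)

    def refresh(self, key, value, now: float) -> None:
        # Install or refresh an entry; the state holder relies on the
        # sender repeating this call periodically.
        self._entries[key] = (value, now + self.lifetime)

    def expire(self, now: float) -> list:
        # Remove entries whose refresh timer has lapsed; return their keys.
        stale = [k for k, (_, t) in self._entries.items() if t <= now]
        for k in stale:
            del self._entries[k]
        return stale

    def get(self, key, now: float):
        # Look up a value, dropping stale entries first.
        self.expire(now)
        entry = self._entries.get(key)
        return entry[0] if entry else None
```

In this model each entry must be refreshed at least once per lifetime,
so the refresh traffic grows in proportion to the number of entries.
That scaling is one way to see why making every piece of state soft
may not be feasible as state grows.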

Management-Friendly Protocols and Data-Plane Primitives.

There is already a huge number of existing mechanisms used in various
aspects of network management. There is an acronym soup of routing,
signaling, QoS, and virtualization protocols; however, they must be
knitted together to provide network services, and this is where
innovation is needed. To reduce the complexity of network management,
we argue for characterizing the behavior of these existing mechanisms,
and then creating useful abstractions for them. INM is interested in
new methods for design and modeling of control-plane protocols and
data-plane primitives. Relevant questions include: What prerequisites
are required by a mechanism? What invariants does a mechanism
maintain? What performance guarantees does a mechanism make? What
metrics should be used to assess both the performance and cost (e.g.,
complexity) of such a mechanism? What are the possible failure modes
of a mechanism? What are the supported methods of recovery? Answers to
these questions will allow network management systems to maximize
automation in service provisioning, network monitoring, failure
diagnosis, and failure recovery.
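
One way to picture how answers to these questions could feed
automation is a machine-readable description of each mechanism. The
Python sketch below is only an assumption about what such a
description might look like: MechanismSpec, its fields, and the
example checks are hypothetical and do not model any real protocol.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, dict]            # hypothetical snapshot of device state
Check = Callable[[State], bool]    # predicate over that snapshot


@dataclass
class MechanismSpec:
    """Hypothetical machine-readable description of one mechanism."""
    name: str
    prerequisites: Dict[str, Check] = field(default_factory=dict)
    invariants: Dict[str, Check] = field(default_factory=dict)
    # Maps each known failure mode to its supported recovery method.
    recovery: Dict[str, str] = field(default_factory=dict)

    def unmet_prerequisites(self, state: State) -> List[str]:
        return [n for n, ok in self.prerequisites.items() if not ok(state)]

    def violated_invariants(self, state: State) -> List[str]:
        return [n for n, ok in self.invariants.items() if not ok(state)]


# Illustrative spec; the names and checks are assumptions.
example_igp = MechanismSpec(
    name="example-igp",
    prerequisites={
        "interfaces_addressed": lambda s: all(s["addresses"].values()),
    },
    invariants={
        "metrics_positive": lambda s: all(m > 0 for m in s["metrics"].values()),
    },
    recovery={"adjacency_loss": "re-run neighbor discovery"},
)
```

A management system holding such descriptions could refuse to
provision a service while prerequisites are unmet, and could look up a
recovery method when a known failure mode is diagnosed.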

Going further, future mechanisms could also be designed to explicitly
support interactions and coordination with other mechanisms. Most of
the existing mechanisms are designed for a specific isolated purpose.
They rarely expose any programmatic interfaces. Yet, in practice,
multiple mechanisms often need to work in conjunction to realize the
network’s objectives. As a result, today’s networks often resort to
customized glue logic and hacks to integrate mechanisms that do not
have explicit interfaces for interactions. One such example is using
route redistribution for gluing together multiple routing protocols
[8]. Configuration hacks like these are responsible for many outages,
as they substantially increase the complexity of the system and make
it much harder to automate network management. Therefore, in designing
new mechanisms, it is worth considering what explicit programmatic
interfaces should be supported to allow component integration to be
seamless. An added advantage of defining these interfaces is the
potential to enhance network performance via joint optimization of the
parameter settings of multiple mechanisms. For example, more efficient
reachability control may be possible by jointly optimizing the
configurations of packet filters and routing protocols [10]. Such
interfaces should also play a key role in balancing trade-offs, such
as intrusiveness vs. accuracy or usability vs. risk, that might be
infeasible to balance based on a view of a single system.

Testbeds,  Data,  and  Evaluation  Methods. 

One barrier to improving the impact and quality of network management
research has been the lack of publicly available, minimally sanitized
network configuration data, traffic traces, and operational experience
data (e.g., outage and error information). INM solicits research tools
for creating, sanitizing, and sharing these types of data broadly
among the research community. By working together, we are finding that
our individual successes at obtaining access to data and configuration
can be leveraged into greater benefits for all.

Another factor slowing down the progress of network management
research is the primitive state of the scientific methods available
for studying network management problems and for evaluating solutions.
We need both formal methods for analyzing network management systems
and, equally important, a set of benchmarks, including performance
metrics and complexity metrics, for comparing network management
solutions. Experimental evaluation is also closely tied to data
availability, as compelling benchmarks need to be based on realistic
workflows, real trace-derived tests, realistic reconfiguration tasks,
realistic offered workload, etc. Environments like Emulab (and GENI in
the future) provide the beginnings of testbeds for network management,
but more needs to be done on simulator support for network management
research, as well as realistic testbeds for repeatable emulation. A
mature set of accessible scientific methodologies will be a great
catalyst for accelerating innovative research in network management.

Leveraging Both Static and Dynamic Verification.

Modern network systems are highly complex and prone to bugs and other
vulnerabilities. Software bugs could cause, for example, a router to
mis-process packets. Software vulnerabilities could be exploited to
allow an attacker to inject misbehaviors that undermine the network’s
functions. Therefore, even if the network management system correctly
manages a network’s state under normal conditions, the run-time
behavior of the network may be degraded or incorrect under adverse
conditions. It is therefore useful to begin thinking about how to
architect a network so that it has built-in support for operational
correctness verification. The challenging issues to address are: What
configuration models can support global and scalable static
verification analysis? How can different network configurations be
compared and ranked? How can the network be instrumented, observed,
and debugged in a timely and synchronized manner? How should data and
control plane behavior be reliably and securely monitored and
reported? How could a network management system automatically derive
the reference correct behavior at the device level? How can a massive
amount of monitored data be processed to verify correctness?
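
As a small illustration of what static verification over a
configuration model can mean, the Python sketch below checks whether
every device in a snapshot of forwarding tables can deliver traffic to
one destination. The table layout (device -> destination -> next hop)
and the function names are assumptions made for this example; real
configuration models are far richer.

```python
def trace(fib: dict, src: str, dst: str) -> str:
    """Follow next hops from src toward dst.

    Returns "delivered", "blackhole" (no next hop), or "loop".
    """
    node, seen = src, set()
    while node != dst:
        if node in seen:
            return "loop"
        seen.add(node)
        next_hop = fib.get(node, {}).get(dst)
        if next_hop is None:
            return "blackhole"
        node = next_hop
    return "delivered"


def verify_reachability(fib: dict, dst: str) -> dict:
    """Report every source device that cannot reach dst, with the reason."""
    results = {src: trace(fib, src, dst) for src in fib if src != dst}
    return {src: r for src, r in results.items() if r != "delivered"}
```

The check runs entirely on the configuration snapshot, without sending
packets, which is what distinguishes static from dynamic verification.
It says nothing about whether devices actually behave as configured.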

3.  PARTICIPATION 

We are looking for others who want to tackle these challenges and join
the conversation. The next INM workshop will be co-located with NSDI
2010 in San Jose, CA. INM is seeking case studies, experimental
results, position papers, as well as provocative ideas and clean-slate
designs. Paper registrations are due on November 30, and the
submission deadline is December 7. The full call for papers is at
http://www.usenix.org/events/inm10.